The circular coordinates algorithm of de Silva, Morozov, and Vejdemo-Johansson takes as input a dataset together with a cohomology class representing a $1$-dimensional hole in the data; the output is a map from the data into the circle that captures this hole, and that is of minimum energy in a suitable sense. However, when applied to several cohomology classes, the output circle-valued maps can be "geometrically correlated" even if the chosen cohomology classes are linearly independent. It is shown in the original work that less correlated maps can be obtained with suitable integer linear combinations of the cohomology classes, with the linear combinations being chosen by inspection. In this paper, we identify a formal notion of geometric correlation between circle-valued maps which, in the Riemannian manifold case, corresponds to the Dirichlet form, a bilinear form derived from the Dirichlet energy. We describe a systematic procedure for constructing low energy torus-valued maps on data, starting from a set of linearly independent cohomology classes. We showcase our procedure with computational examples. Our main algorithm is based on the Lenstra--Lenstra--Lov\'asz algorithm from computational number theory.
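To illustrate how integer linear combinations can decorrelate two classes, here is a minimal sketch of Lagrange-Gauss reduction, the two-dimensional special case of lattice reduction that LLL generalizes. It is not the paper's algorithm: the Gram matrix `gram` (standing in for the Dirichlet form evaluated on two cohomology classes) and the function names are hypothetical, chosen only for illustration.

```python
from fractions import Fraction


def inner(u, v, gram):
    """Bilinear form u^T G v for integer coefficient vectors u and v."""
    return sum(u[i] * gram[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))


def lagrange_reduce(u, v, gram):
    """Reduce a rank-2 integer basis with respect to a positive definite form."""
    u, v = tuple(u), tuple(v)
    while True:
        # Keep the lower-energy vector in u.
        if inner(u, u, gram) > inner(v, v, gram):
            u, v = v, u
        # Nearest integer multiple of u to subtract from v (exact arithmetic).
        m = round(Fraction(inner(u, v, gram), inner(u, u, gram)))
        if m == 0:
            return u, v
        v = tuple(b - m * a for a, b in zip(u, v))


# Hypothetical Gram matrix: two classes of equal energy that are strongly correlated.
gram = [[10, 9], [9, 10]]
reduced = lagrange_reduce((1, 0), (0, 1), gram)
before = abs(inner((1, 0), (0, 1), gram))        # 9
after = abs(inner(reduced[0], reduced[1], gram))  # 1
```

In this toy case the reduced basis replaces one class by the difference of the two, which lowers both its energy and its correlation with the other class.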
Bias elimination and recent probing studies attempt to remove specific information from embedding spaces. Here it is important to remove as much of the target information as possible, while preserving any other information present. INLP is a popular recent method which removes specific information through iterative nullspace projections. Multiple iterations, however, increase the risk that information other than the target is negatively affected. We introduce two methods that find a single targeted projection: Mean Projection (MP, more efficient) and Tukey Median Projection (TMP, with theoretical guarantees). Our comparison between MP and INLP shows that (1) one MP projection removes linear separability based on the target and (2) MP has less impact on the overall space. Further analysis shows that applying random projections after MP leads to the same overall effects on the embedding space as the multiple projections of INLP. Applying one targeted (MP) projection hence is methodologically cleaner than applying multiple (INLP) projections that introduce random effects.
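As a rough illustration of a single targeted projection, the sketch below removes the direction connecting the two class means from every embedding. This is one plausible reading of Mean Projection under the assumption of a binary target. The abstract does not give the exact formulation, and the function names and toy data are hypothetical.

```python
def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def mean_projection(class_a, class_b):
    """Return a function that projects out the direction between the class means."""
    mu_a, mu_b = mean(class_a), mean(class_b)
    direction = [a - b for a, b in zip(mu_a, mu_b)]
    norm_sq = sum(x * x for x in direction)
    if norm_sq == 0:
        raise ValueError("class means coincide; nothing to project out")

    def project(v):
        # Subtract the component of v along the mean-difference direction.
        coeff = sum(x * y for x, y in zip(v, direction)) / norm_sq
        return [x - coeff * y for x, y in zip(v, direction)]

    return project


# Toy embeddings: the first coordinate encodes the target, the second does not.
class_a = [[2.0, 1.0], [4.0, 3.0]]
class_b = [[-2.0, 1.0], [-4.0, 3.0]]
project = mean_projection(class_a, class_b)
projected_a = [project(v) for v in class_a]
projected_b = [project(v) for v in class_b]
```

After the projection the two class means coincide, while the coordinate orthogonal to the removed direction is left unchanged.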
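The partially observable Markov decision process (POMDP) is a powerful framework for capturing decision-making problems that involve state and transition uncertainty. However, most current POMDP planners cannot effectively handle the very high-dimensional observations that they often encounter in the real world (e.g., image observations in robotic domains). In this work, we propose Visual Tree Search (VTS), a learning and planning procedure that combines generative models learned offline with online model-based POMDP planning. VTS bridges offline model training and online planning by using a set of deep generative observation models to predict and evaluate the likelihood of image observations in a Monte Carlo tree search planner. We show that VTS is robust to different observation noises and, because it uses online, model-based planning, can adapt to different reward structures without retraining. This new approach outperforms a baseline state-of-the-art on-policy planning algorithm while using significantly less offline training time.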
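All lossy compression algorithms employ a similar compression scheme: a frequency-domain transform followed by quantization and a lossless encoding scheme. They target a tradeoff by quantizing high-frequency data, increasing the compression rate at the cost of higher image distortion. We propose a new method of optimizing quantization tables using deep learning that measures the tradeoff between the rate and distortion (RD) parameters more accurately than previous methods. We design a convolutional neural network (CNN) that learns a mapping between image blocks and quantization tables in an unsupervised manner. By processing all channels of an image at once, we can achieve stronger performance by also measuring the tradeoffs in information loss between different channels. We initially target optimization on JPEG images, but feel that this can be extended to any lossy compressor.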
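The rate-distortion tradeoff controlled by a quantization table can be sketched in a few lines. The example below assumes the frequency-domain coefficients are already available and uses a tiny 2x2 block rather than JPEG's 8x8 blocks. The count of nonzero quantized coefficients is only a crude stand-in for the bit rate, and all names and values are hypothetical.

```python
def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to an integer level."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def dequantize(levels, table):
    """Reconstruct approximate coefficients from the integer levels."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]


def rate_proxy(levels):
    """Number of nonzero levels: a rough proxy for the bits an entropy coder needs."""
    return sum(1 for row in levels for l in row if l != 0)


def distortion(coeffs, table):
    """Mean squared error between the original and reconstructed coefficients."""
    recon = dequantize(quantize(coeffs, table), table)
    errs = [(c - r) ** 2
            for crow, rrow in zip(coeffs, recon)
            for c, r in zip(crow, rrow)]
    return sum(errs) / len(errs)


coeffs = [[100.0, 12.0], [-7.0, 3.0]]  # toy transform coefficients
fine = [[1, 1], [1, 1]]                # no effective quantization
coarse = [[16, 16], [16, 16]]          # aggressive quantization

rate_fine = rate_proxy(quantize(coeffs, fine))      # 4
rate_coarse = rate_proxy(quantize(coeffs, coarse))  # 2
dist_fine = distortion(coeffs, fine)                # 0.0
dist_coarse = distortion(coeffs, coarse)            # 22.5
```

The coarser table halves the number of nonzero coefficients but raises the mean squared error from 0 to 22.5, which is the kind of tradeoff a learned quantization table would try to balance per image.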
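Recent approaches have shown that training deep neural networks directly on large-scale collections of image-text pairs enables zero-shot transfer on a variety of recognition tasks. One central issue is how this can be generalized to object detection, which involves the non-semantic task of localization as well as the semantic task of classification. To address this problem, we introduce a vision embedding alignment method that transfers the generalization capabilities of a pretrained model such as CLIP to an object detector such as YOLOv5. We formulate a loss function that allows us to align the image and text embeddings of the pretrained CLIP model with the modified semantic prediction head of the detector. With this approach, we are able to train an object detector that achieves state-of-the-art performance on the COCO, ILSVRC, and Visual Genome zero-shot detection benchmarks. During inference, our model can be adapted to detect any number of object classes without additional training. We also find that standard object detection scaling transfers well to our method, with consistent improvements across various scales of YOLOv5 models and the YOLOv3 model. Finally, we develop a self-labeling method that provides a significant score improvement without needing extra images or labels.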
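The idea of aligning a detector's semantic head with text embeddings can be sketched as follows. The abstract does not specify the loss, so the cosine-based loss below is an assumption, and the embeddings, class names, and function names are hypothetical toy values rather than outputs of CLIP or YOLOv5.

```python
import math


def cosine(a, b):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def alignment_loss(region_embedding, target_embedding):
    """Assumed loss: zero when the detector's embedding points along the target."""
    return 1.0 - cosine(region_embedding, target_embedding)


def classify(region_embedding, class_text_embeddings):
    """Pick the class whose text embedding is most similar to the region embedding."""
    return max(class_text_embeddings,
               key=lambda name: cosine(region_embedding, class_text_embeddings[name]))


# Toy text embeddings; a new class can be added at inference time without retraining.
text_embeddings = {"cat": [1.0, 0.0], "dog": [0.0, 1.0]}
prediction = classify([0.9, 0.1], text_embeddings)

text_embeddings["bird"] = [-1.0, 0.0]
prediction_with_new_class = classify([-0.8, 0.2], text_embeddings)
```

Because classification reduces to a nearest-text-embedding lookup, extending the label set only requires embedding the new class names, which matches the abstract's claim about detecting new classes without additional training.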